1 Introduction

Suppose we are given observations $(y_j)_{1\le j\le n}$ which obey the heteroscedastic
regression equation

$$ y_j = S(x_j) + \sigma_j \xi_j , \qquad 1 \le j \le n , \tag{1.1} $$

where the design points are $x_j = j/n$, $S(\cdot)$ is an unknown function to be
estimated, $(\xi_j)_{1\le j\le n}$ is a sequence of i.i.d. random variables with
$\mathbf{E}\xi_1 = 0$ and $\mathbf{E}\xi_1^2 = 1$, and $(\sigma_j)_{1\le j\le n}$ are unknown
volatility coefficients, which may depend on the design points and on the unknown
regression function $S$.

The models of type (1.1) with $\sigma_j = \sigma_j(x_j)$ were introduced in [JJ as a
generalisation of the nonparametric ANCOVA model of [JS]. It should be noted
that heteroscedastic regressions with this type of volatility coefficients have
been encountered in econometric studies, namely, in consumer budget studies
utilizing observations on individuals with diverse incomes and in analyses
of the investment behavior of firms of different sizes (see [12]). For example,
for consumer budget problems a parametric version of model (1.1) is used there
(see p. 83), with the volatility coefficient defined as

$$ \sigma_j^2 = c_0 + c_1 x_j + c_2 S^2(x_j) , \tag{1.2} $$

where $c_0$, $c_1$ and $c_2$ are some nonnegative unknown constants.
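To make the model concrete, the following minimal sketch draws one sample from a regression of type (1.1) with the parametric volatility (1.2). The function names, the default constants `c0`, `c1`, `c2`, the Gaussian noise and the test signal are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_observations(n, S, c0=1.0, c1=0.0, c2=1.0, seed=0):
    """Draw y_j = S(x_j) + sigma_j * xi_j on the design x_j = j / n.

    sigma_j^2 follows the parametric form (1.2); the constants c0, c1, c2
    and the Gaussian noise are illustrative assumptions.
    """
    rng = random.Random(seed)
    xs = [j / n for j in range(1, n + 1)]
    sigma2 = [c0 + c1 * x + c2 * S(x) ** 2 for x in xs]
    ys = [S(x) + math.sqrt(s2) * rng.gauss(0.0, 1.0) for x, s2 in zip(xs, sigma2)]
    return xs, ys, sigma2


def example_signal(x):
    """A smooth 1-periodic test function (hypothetical choice)."""
    return math.sin(2.0 * math.pi * x)


xs, ys, sigma2 = simulate_observations(21, example_signal)
```

Fixing the seed makes repeated runs reproducible, which is convenient when different estimators are compared on the same sample.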

Moreover, this regression model appears in the drift estimation problem
for stochastic differential equations when one passes from a continuous time
model to a discrete time model by making use of sequential kernel estimators
having asymptotically minimal variances (see [6], [8]-[10]).

The volatility coefficient estimation in heteroscedastic regression was
considered in a few papers (see, for example, [3] and the references therein).
In that paper, by making use of the squared first-order differences of the
observations, the initial problem was reduced to the regression function
estimation in a model of type (1.1).

In this paper we develop the approach proposed in [7]. The first goal of
the research is to construct an adaptive procedure for estimating the function
$S$ which does not use any smoothness information on $S$ and which is based on
the observations $(y_j)_{1\le j\le n}$, and further to obtain a sharp non-asymptotic
upper bound (oracle inequality) for the quadratic risk in the case when the
smoothness of $S$ is unknown. The second goal is to prove that the constructed
procedure is also efficient in the asymptotic setup.

Problems of constructing a nonparametric estimator and proving a non-asymptotic
upper bound for the risk in the homoscedastic model, that is when
$\sigma_j = \sigma$, were studied in a few papers. A non-asymptotic upper bound for the
quadratic risk over thresholding estimators is given in [IB] . In the papers [2],
[15] an adaptive model selection procedure has been constructed. It is based
on least squares estimators, and a non-asymptotic upper bound has been
obtained for the quadratic risk which is best in the principal term for the given
class of estimators when the noise vector $(\xi_1, \ldots, \xi_n)$ is gaussian. Upper
bounds of this type are called oracle inequalities. In [5] the oracle inequality
has been obtained for a model selection procedure based on any estimators
in the case when the noise vector $(\xi_1, \ldots, \xi_n)$ has a spherically symmetric
distribution. Moreover, some sharp oracle inequalities have also been obtained
for homoscedastic regression with gaussian noise, see, for example, [H].
Here the adjective "sharp" means that the coefficient of the principal term
may be chosen as close to unity as desired.

In this paper an adaptive procedure is constructed for heteroscedastic
regression, for which a sharp non-asymptotic oracle inequality is proved. It
should be noted that the methods used in earlier papers to obtain sharp
oracle inequalities in regression models are limited to the homoscedastic case,
since they are based on the fact that an orthogonal transformation of a gaussian
noise vector $(\xi_1, \ldots, \xi_n)$ gives a gaussian vector. In the heteroscedastic
regression models under consideration these methods are not valid, since the noise
vector is not gaussian. To obtain sharp non-asymptotic oracle inequalities in
the heteroscedastic case we develop new mathematical tools based
on "penalty" methods and Pinsker-type weights.

Moreover, in [11] we show that the given adaptive estimator is efficient
in the asymptotic sense, that is, a sharp asymptotic lower bound is proved
for the quadratic risk and it is attained by this estimator. The sharp
non-asymptotic oracle inequality plays a key role in proving the asymptotic
efficiency. To obtain the optimal upper bound for the risk one should use
a weighted least squares estimator with weights depending on the smoothness
of the unknown regression function. Since the smoothness is unknown,
one cannot directly use the weighted least squares estimator that gives the
minimal upper bound. The sharp non-asymptotic oracle inequality allows us
to replace this unknown weighted least squares estimator with an adaptive
estimator. The risk of the adaptive estimator is less than the risk of the optimal
(unknown) weighted least squares estimator up to additive and multiplicative
constants. Taking into account that the multiplicative constant tends
to one and that the order of the additive constant is less than the order of the
convergence rate, we obtain that the risk of the given adaptive procedure
asymptotically coincides with the risk of the optimal (unknown) weighted
least squares estimator. Therefore, given the optimal lower bound, we obtain
the asymptotic efficiency of the adaptive procedure satisfying the sharp
non-asymptotic oracle inequality.

The paper is organized as follows. In Section 2 we construct an adaptive
estimation procedure based on weighted least squares estimators and obtain
a non-asymptotic upper bound for the quadratic risk. In Section 3 we
propose an estimator for the integrated noise variance and give the oracle
inequality in the case of the Sobolev space, $S \in W_r^k$. The proofs are given in
Section 4. Section 5 contains a numerical comparison of the given procedure
with an adaptive procedure proposed in [3]. The Appendix contains some
technical results.



2 Oracle inequality

In this paper we study the non-asymptotic estimation problem for the function
$S$ in the model (1.1), based on observations $(y_j)_{1\le j\le n}$ with an odd sample size $n$.
We assume that in (1.1) the sequence $(\xi_j)_{1\le j\le n}$ is i.i.d. with

$$ \mathbf{E}\xi_1 = 0 , \quad \mathbf{E}\xi_1^2 = 1 \quad\text{and}\quad \mathbf{E}\xi_1^4 = \xi^* < \infty . \tag{2.1} $$

Moreover, we assume that $(\sigma_l)_{1\le l\le n}$ is a sequence of positive random
variables independent of $(\xi_i)_{1\le i\le n}$ and bounded away from $+\infty$, i.e. there
exists some nonrandom unknown constant $\sigma_* > 0$ such that

$$ \max_{1\le l\le n} \sigma_l^2 \le \sigma_* . \tag{2.2} $$

For any estimate $\hat S_n$ of $S$ based on observations $(y_j)_{1\le j\le n}$, the estimation
accuracy is measured by the mean integrated squared error (MISE)

$$ \mathbf{E}_S \|\hat S_n - S\|_n^2 , \tag{2.3} $$

where

$$ \|\hat S_n - S\|_n^2 = (\hat S_n - S , \hat S_n - S)_n = \frac{1}{n} \sum_{l=1}^{n} \big(\hat S_n(x_l) - S(x_l)\big)^2 . $$

We make use of the trigonometric basis $(\phi_j)_{j\ge 1}$ in $L_2[0,1]$ with

$$ \phi_1 = 1 , \quad \phi_j(x) = \sqrt{2}\, Tr_j(2\pi [j/2] x) , \quad j \ge 2 , \tag{2.4} $$

where the function $Tr_j(x) = \cos(x)$ for even $j$ and $Tr_j(x) = \sin(x)$ for odd
$j$; $[x]$ denotes the integer part of $x$. Note that if $n$ is odd, then this basis is
orthonormal for the empirical inner product generated by the sieve $(x_j)_{1\le j\le n}$,
that is, for any $1 \le i, j \le n$,

$$ (\phi_i , \phi_j)_n = \frac{1}{n} \sum_{l=1}^{n} \phi_i(x_l) \phi_j(x_l) = \delta_{ij} , \tag{2.5} $$

where $\delta_{ij}$ is Kronecker's symbol.
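The role of the odd sample size can be checked numerically. The sketch below builds the basis (2.4) on the design $x_l = l/n$ and computes the Gram matrix of the empirical inner product (2.5); the helper names `phi` and `gram_matrix` are hypothetical and introduced only for illustration.

```python
import math


def phi(j, x):
    """Trigonometric basis (2.4): phi_1 = 1, cosine for even j, sine for odd j >= 3."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def gram_matrix(n):
    """Matrix of empirical inner products (phi_i, phi_j)_n on the design x_l = l / n."""
    xs = [l / n for l in range(1, n + 1)]
    vals = [[phi(j, x) for x in xs] for j in range(1, n + 1)]
    return [
        [sum(a * b for a, b in zip(vals[i], vals[j])) / n for j in range(n)]
        for i in range(n)
    ]


gram_odd = gram_matrix(7)    # numerically the 7 x 7 identity matrix
gram_even = gram_matrix(6)   # orthonormality fails for the last basis function
```

For even $n$ the last function reduces to $\sqrt{2}\cos(\pi l)$ on the design, so its empirical squared norm is 2 rather than 1, which illustrates why $n$ is taken odd.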

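By making use of this basis we define the discrete Fourier transformation
in (1.1) and obtain the Fourier coefficients

$$ \hat\theta_{j,n} = (Y , \phi_j)_n \quad\text{and}\quad \theta_{j,n} = (S , \phi_j)_n . \tag{2.6} $$

Here $Y = (y_1, \ldots, y_n)'$ and $S = (S(x_1), \ldots, S(x_n))'$. The prime denotes
transposition.

From (1.1) it follows directly that these Fourier coefficients satisfy the
following equation

$$ \hat\theta_{j,n} = \theta_{j,n} + \frac{1}{\sqrt{n}} \xi_{j,n} \quad\text{with}\quad \xi_{j,n} = \frac{1}{\sqrt{n}} \sum_{l=1}^{n} \sigma_l \xi_l \phi_j(x_l) . \tag{2.7} $$
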
We estimate the function $S$ by the weighted least squares estimator

$$ \hat S_\lambda(x) = \sum_{j=1}^{n} \lambda(j) \hat\theta_{j,n} \phi_j(x) , \tag{2.8} $$

where $x \in [0,1]$ and the weight vector $\lambda = (\lambda(1), \ldots, \lambda(n))'$ belongs to some
finite set $\Lambda \subset [0,1]^n$. We denote by $\nu$ the cardinal number of the set $\Lambda$.
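As an illustration of (2.6) and (2.8), the sketch below computes the empirical Fourier coefficients of a small data vector and evaluates the weighted estimator for two weight vectors. The data values and the helper names are hypothetical; only the formulas are taken from the text.

```python
import math


def phi(j, x):
    """Trigonometric basis (2.4), indexed from j = 1."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def fourier_coefficients(ys):
    """Empirical coefficients theta_hat_{j,n} = (Y, phi_j)_n for j = 1..n, as in (2.6)."""
    n = len(ys)
    xs = [l / n for l in range(1, n + 1)]
    return [sum(y * phi(j, x) for y, x in zip(ys, xs)) / n for j in range(1, n + 1)]


def weighted_estimator(theta_hat, lam, x):
    """Evaluate S_hat_lambda(x) = sum_j lam(j) * theta_hat_j * phi_j(x), as in (2.8)."""
    return sum(
        l * t * phi(j, x) for j, (l, t) in enumerate(zip(lam, theta_hat), start=1)
    )


ys = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, 0.0]  # n = 7 (odd), arbitrary illustrative data
n = len(ys)
theta_hat = fourier_coefficients(ys)

# All weights equal to 1: no smoothing, the estimator interpolates the data.
fitted_full = [weighted_estimator(theta_hat, [1.0] * n, l / n) for l in range(1, n + 1)]
# Only the first weight equal to 1: the estimator is the sample mean.
fitted_mean = [
    weighted_estimator(theta_hat, [1.0] + [0.0] * (n - 1), l / n) for l in range(1, n + 1)
]
```

The two extreme weight vectors show the trade-off that the choice of $\lambda$ controls: full weights reproduce the noisy observations, while a single nonzero weight keeps only the constant term.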

Now we need to specify a cost function in order to choose a weight $\lambda \in \Lambda$.
The best choice would be to minimize the empirical squared error

$$ \mathrm{Err}_n(\lambda) = \|\hat S_\lambda - S\|_n^2 , $$

which in our case is equal to

$$ \mathrm{Err}_n(\lambda) = \sum_{j=1}^{n} \lambda^2(j) \hat\theta_{j,n}^2 - 2 \sum_{j=1}^{n} \lambda(j) \hat\theta_{j,n} \theta_{j,n} + \sum_{j=1}^{n} \theta_{j,n}^2 . \tag{2.9} $$
Since the coefficients $\theta_{j,n}$ are unknown, we need to replace the term
$\hat\theta_{j,n} \theta_{j,n}$ by some estimator, which we choose as

$$ \tilde\theta_{j,n} = \hat\theta_{j,n}^2 - \frac{1}{n} \hat\varsigma_n , $$

where $\hat\varsigma_n$ is some estimator of the integrated noise variance

$$ \varsigma_n = n^{-1} \sum_{l=1}^{n} \sigma_l^2 . \tag{2.10} $$

An estimator of this type is given in (3.5).

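Moreover, for this substitution into the empirical squared error one needs
to pay a penalty. Finally, we define the cost function as

$$ J_n(\lambda) = \sum_{j=1}^{n} \lambda^2(j) \hat\theta_{j,n}^2 - 2 \sum_{j=1}^{n} \lambda(j) \tilde\theta_{j,n} + \rho \hat P_n(\lambda) , \tag{2.11} $$

where $\rho$ is some positive coefficient which will be chosen later. The penalty
term is defined as

$$ \hat P_n(\lambda) = \frac{|\lambda|^2 \hat\varsigma_n}{n} \quad\text{with}\quad |\lambda|^2 = \sum_{j=1}^{n} \lambda^2(j) . \tag{2.12} $$

Note that in the case when the sequence $(\sigma_l)_{1\le l\le n}$ is known, i.e. $\hat\varsigma_n = \varsigma_n$, we
obtain

$$ P_n(\lambda) = \frac{|\lambda|^2 \varsigma_n}{n} . \tag{2.13} $$

We set

$$ \hat\lambda = \mathrm{argmin}_{\lambda \in \Lambda} J_n(\lambda) \tag{2.14} $$

and define an estimator of $S$ as

$$ \hat S_* = \hat S_{\hat\lambda} . \tag{2.15} $$

We recall that the set $\Lambda$ is finite, so $\hat\lambda$ exists. If $\hat\lambda$ is not unique,
we take any one of the minimizers.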
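The selection rule (2.14) can be sketched in a few lines. The code below assumes the cost function has the form $J_n(\lambda) = \sum_j \lambda^2(j)\hat\theta_{j,n}^2 - 2\sum_j \lambda(j)(\hat\theta_{j,n}^2 - \hat\varsigma_n/n) + \rho|\lambda|^2\hat\varsigma_n/n$, which is how (2.11) and (2.12) are read here; the function names and the toy numbers are hypothetical.

```python
def cost(theta_hat, lam, varsigma_hat, rho):
    """Penalized cost J_n(lambda), under the assumed reading of (2.11)-(2.12)."""
    n = len(theta_hat)
    fit = sum(l * l * t * t for l, t in zip(lam, theta_hat))
    # theta_tilde_j = theta_hat_j^2 - varsigma_hat / n replaces theta_hat_j * theta_j.
    cross = sum(l * (t * t - varsigma_hat / n) for l, t in zip(lam, theta_hat))
    penalty = sum(l * l for l in lam) * varsigma_hat / n
    return fit - 2.0 * cross + rho * penalty


def select_weights(theta_hat, weight_family, varsigma_hat, rho):
    """Return a minimizer of the cost over a finite family of weight vectors."""
    return min(weight_family, key=lambda lam: cost(theta_hat, lam, varsigma_hat, rho))


# Toy example: one large coefficient and two coefficients at the noise level.
theta_hat = [3.0, 0.1, 0.1]
family = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
best = select_weights(theta_hat, family, varsigma_hat=0.3, rho=0.25)
```

In this toy example the two small coefficients lie below the noise level $\hat\varsigma_n/n = 0.1$, so keeping them increases the cost and the rule selects the first weight vector. Python's `min` returns the first minimizer in case of ties, which matches the convention of taking any one of them.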

To formulate the oracle inequality we introduce, for $0 < \rho < 1/3$, the
following function

r (p) = — + 4u hn (l + u^pi + AvvJtjL . (2.16) 



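Here and in what follows we make use of the following notations: for $i = 1, 2$,

$$ v_n = \max_{\lambda\in\Lambda} \sum_{j=1}^{n} \lambda(j) \quad\text{and}\quad u_{i,n} = \max_{\lambda\in\Lambda} \sup_{1\le l\le n} \Big| \sum_{j=1}^{n} \lambda^i(j) \big(\phi_j^2(x_l) - 1\big) \Big| . \tag{2.17} $$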



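Theorem 2.1. Let $\Lambda$ be any finite set in $[0,1]^n$. For any $n \ge 3$ and
$0 < \rho < 1/3$, the estimator $\hat S_*$ satisfies the oracle inequality

$$ \mathbf{E}_S \|\hat S_* - S\|_n^2 \le \frac{1+3\rho-2\rho^2}{1-3\rho} \min_{\lambda\in\Lambda} \mathbf{E}_S \|\hat S_\lambda - S\|_n^2 + \frac{1}{n} B_n(\rho) , \tag{2.18} $$

where $B_n(\rho) = \Psi_n(\rho) + \kappa(\rho) v_n \mathbf{E}_S |\hat\varsigma_n - \varsigma_n|$ with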

p(l-p)T;(p) + 2z/ + 2p 2 (l-p) M2i 



p(l - 3p) 



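and $\kappa(\rho) = 4(1-\rho^2)/(1-3\rho)$.

If in the model (1.1) the volatility coefficients $(\sigma_l)_{1\le l\le n}$ are known, then
$\hat\varsigma_n = \varsigma_n$ and inequality (2.18) has the following form

$$ \mathbf{E}_S \|\hat S_* - S\|_n^2 \le \frac{1+3\rho-2\rho^2}{1-3\rho} \min_{\lambda\in\Lambda} \mathbf{E}_S \|\hat S_\lambda - S\|_n^2 + \frac{1}{n} \Psi_n(\rho) . \tag{2.19} $$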



Remark 2.1. Note that the principal term on the right-hand side of (2.18)-(2.19)
is best in the class of estimators $(\hat S_\lambda , \lambda \in \Lambda)$. Inequalities of this
type are called sharp non-asymptotic oracle inequalities. The inequality
is sharp in the sense that the coefficient of the principal term may be chosen
as close to 1 as desired. Similar inequalities for homoscedastic models (1.1)
with $\sigma_l = \sigma$ were given, for example, in [T$ . The methods used there cannot
be extended to the heteroscedastic case since, after the Fourier transformation,
the random variables $(\xi_{j,n})$ in model (2.7) are dependent, contrary to the
homoscedastic case, where these random variables are independent (see, for
example, JT^) .
Remark 2.2. If one would like to obtain the asymptotically minimal MISE
of the estimator $\hat S_*$, then the secondary term $B_n(\rho)$ in (2.18) should be slowly
varying, i.e. for any $\gamma > 0$,

$$ B_n(\rho)/n^{\gamma} \to 0 , \quad\text{as}\quad n \to \infty . \tag{2.20} $$

Indeed, since usually the optimal rate is of order $n^{2k/(2k+1)}$ for some $k \ge 1$,
after multiplying the inequality (2.18) by this rate the principal term
gives the optimal constant and the secondary one is of the type

$$ n^{-1/(2k+1)} B_n(\rho) . $$

Therefore the property (2.20) provides the asymptotic vanishing, as $n \to \infty$, of
the secondary term for $k \ge 1$. To obtain the property (2.20), it suffices
that, for any $\gamma > 0$,

$$ \rho\, n^{\gamma} \to +\infty , $$

and 

— ■ '■ >0, as n-^-oo, (2.21) 

n 1 

thanks to the definitions of $B_n(\rho)$ and $\Psi_n(\rho)$. To obtain the first convergence it
suffices to take the parameter $\rho$ as $\rho \ge g_n$, where $g_n$ is a slowly decreasing
function, i.e.

$$ \lim_{n\to\infty} g_n = 0 \quad\text{and for any } \gamma > 0 \quad \lim_{n\to\infty} n^{\gamma} g_n = +\infty , \tag{2.22} $$

for example, $g_n = 1/\ln n$. For the second convergence the choice of $u_{1,n}$, $u_{2,n}$, $v_n$
and of the estimator $\hat\varsigma_n$ is proposed below.

Consider now the order of the terms $v_n$, $u_{1,n}$, $u_{2,n}$ and of the function $\Psi_n(\rho)$
in the case when the finite set $\Lambda$ is formed by a special version of Pinsker's
weights (see, for example, [16]). To this end, we define the sieve

$$ \mathcal{A}_\varepsilon = \{1, \ldots, k^*\} \times \{t_1, \ldots, t_m\} , $$

where $t_i = i\varepsilon$ and $m = [1/\varepsilon^2]$. We suppose that the parameters $k^* \ge 1$ and
$0 < \varepsilon \le 1$ are functions of $n$, i.e. $k^* = k^*_n$ and $\varepsilon = \varepsilon_n$, such that

$$ \lim_{n\to\infty} k^*_n = +\infty , \quad \lim_{n\to\infty} k^*_n / \ln n = 0 , \quad \lim_{n\to\infty} \varepsilon_n = 0 \quad\text{and}\quad \lim_{n\to\infty} n^{\gamma} \varepsilon_n = +\infty , \tag{2.23} $$

for any $\gamma > 0$. For example, one can take for $n \ge 3$

$$ \varepsilon_n = 1/\ln n \quad\text{and}\quad k^*_n = \bar k + \sqrt{\ln n} , \tag{2.24} $$

where $\bar k$ is any nonnegative constant.

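For any $\alpha = (\beta, t) \in \mathcal{A}_\varepsilon$ we define the weight vector
$\lambda_\alpha = (\lambda_\alpha(1), \ldots, \lambda_\alpha(n))'$ as

$$ \lambda_\alpha(j) = \mathbf{1}_{\{1 \le j \le j_0\}} + \big(1 - (j/\omega_\alpha)^{\beta}\big) \mathbf{1}_{\{j_0 < j \le \omega_\alpha\}} . \tag{2.25} $$

Here $j_0 = j_0(\alpha) = [\omega_\alpha / \ln n]$ with

$$ \omega_\alpha = \bar\omega + (A_\beta\, t\, n)^{1/(2\beta+1)} , $$

where $\bar\omega$ is any nonnegative constant and

$$ A_\beta = (\beta+1)(2\beta+1)/(\pi^{2\beta} \beta) . $$

Hence,

$$ \Lambda = \{\lambda_\alpha , \alpha \in \mathcal{A}_\varepsilon\} \tag{2.26} $$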
and $\nu = \nu_n = k^*_n m_n$. Note that in this case, in view of (2.23), for any $\gamma > 0$,

$$ \lim_{n\to\infty} \nu_n / n^{\gamma} = 0 . $$
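The weight family can be generated directly from the sieve. The sketch below assumes the reading of (2.25) in which $\lambda_\alpha(j) = 1$ for $j \le j_0$, $\lambda_\alpha(j) = 1 - (j/\omega_\alpha)^\beta$ for $j_0 < j \le \omega_\alpha$ and $0$ otherwise, with $j_0 = [\omega_\alpha/\ln n]$ and $A_\beta = (\beta+1)(2\beta+1)/(\pi^{2\beta}\beta)$. The function name, the integer grid for $\beta$ and the numerical values in the example are illustrative assumptions.

```python
import math


def pinsker_weight_family(n, k_star, eps, omega_bar=0.0):
    """Build weight vectors indexed by alpha = (beta, t) on the sieve A_eps.

    beta runs over 1..k_star and t over i * eps for i = 1..m with m = [1 / eps^2].
    """
    m = int(1.0 / eps ** 2)
    family = {}
    for beta in range(1, k_star + 1):
        a_beta = (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)
        for i in range(1, m + 1):
            t = i * eps
            omega = omega_bar + (a_beta * t * n) ** (1.0 / (2 * beta + 1))
            j0 = int(omega / math.log(n))
            lam = []
            for j in range(1, n + 1):
                if j <= j0:
                    lam.append(1.0)                         # unshrunk low frequencies
                elif j <= omega:
                    lam.append(1.0 - (j / omega) ** beta)   # Pinsker-type taper
                else:
                    lam.append(0.0)                         # discarded high frequencies
            family[(beta, t)] = lam
    return family


family = pinsker_weight_family(n=101, k_star=3, eps=0.5)
```

The number of weight vectors equals $k^* m$, so with the slowly varying choices (2.24) the family stays small compared with any power of $n$.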

Moreover, by ( 12T23]) 



E A ^) = 1 oo>i}^ + i { ^>i } E { l -W"«f)< 

i=i i=io+ 1 
Therefore, taking into account that $0 < A_\beta \le 1$ for $\beta \ge 1$, we find that

$$ v_n \le \bar\omega + (n/\varepsilon)^{1/3} , $$

i.e. 

lim sup < 00 . (2.27) 

>n 



Moreover, note that for any 1 < I < n, we get 

E *«0") - !) = E - !) 



+ Wn E (i-(ikfl(^)-i) 
Thus Lemma A.2 implies that

u hn < 1 + 2 /m < 1 + 2 k * +1 . 

Due to the condition on $k^*$ in (2.23) this function is slowly varying, i.e. for
any $\gamma > 0$,

$$ \lim_{n\to\infty} u_{1,n} / n^{\gamma} = 0 . \tag{2.28} $$

In the same way we obtain that

$$ u_{2,n} \le 1 + 2^{k^*+2} + 2^{2k^*+1} $$

and, therefore, for any $\gamma > 0$,

$$ \lim_{n\to\infty} u_{2,n} / n^{\gamma} = 0 . \tag{2.29} $$

Then for any sequence $(g_n)_{n\ge 1}$ satisfying the properties (2.22) and for any $\gamma > 0$,

$$ \lim_{n\to\infty} \sup_{g_n \le \rho \le 1/6} \Psi_n(\rho) / n^{\gamma} = 0 . $$

Remark 2.3. As we shall see in the proof of Theorem 2.1, the oracle inequality
is true for any basis and any design $(x_k)_{1\le k\le n}$ which possess the
orthonormality property (2.5). Moreover, if the sequences (2.17) and an
estimator of the unknown integrated variance $\varsigma_n$ satisfy the property (2.21), then
the secondary term in the inequality (2.18) possesses the property (2.20). In
the next section we give an estimator $\hat\varsigma_n$ in the case of the trigonometric basis
and the equidistant design.



3 Oracle inequality for $S \in W_r^k$

Assume that $S : \mathbb{R} \to \mathbb{R}$ is a $k$ times differentiable 1-periodic function such
that

$$ \sum_{j=0}^{k} \|S^{(j)}\|^2 \le r , \tag{3.1} $$

where

$$ \|f\|^2 = \int_0^1 f^2(t)\, dt . \tag{3.2} $$

We denote by $W_r^k$ the set of all such functions. Moreover, we suppose that
$r > 0$ and $k \ge 1$ are unknown parameters.

Note that the space $W_r^k$ can be represented as an ellipse in the Hilbert
space, i.e.

$$ W_r^k = \Big\{ S \in L_2[0,1] : S = \sum_{j=1}^{\infty} \theta_j \phi_j \ \text{such that}\ \sum_{j=1}^{\infty} a_j \theta_j^2 \le r \Big\} , \tag{3.3} $$

where the basis functions $(\phi_j)_{j\ge 1}$ are defined in (2.4) and $(\theta_j)_{j\ge 1}$ are the Fourier
coefficients, i.e.

$$ \theta_j = (S , \phi_j) = \int_0^1 S(t) \phi_j(t)\, dt . \tag{3.4} $$

The coefficients $(a_j)_{j\ge 1}$ are defined as

$$ a_j = \sum_{i=0}^{k} \|\phi_j^{(i)}\|^2 = \sum_{i=0}^{k} (2\pi [j/2])^{2i} . $$

To estimate $\varsigma_n$ we make use of the following estimator:

$$ \hat\varsigma_n = \sum_{j=d_n+1}^{n} \hat\theta_{j,n}^2 , \tag{3.5} $$

where the parameter $d_n$, $1 \le d_n \le n-1$, will be chosen later.
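A minimal sketch of the estimator of the integrated variance follows, assuming it is the sum of the squared empirical Fourier coefficients with indices above $d_n$, i.e. $\hat\varsigma_n = \sum_{j=d_n+1}^{n} \hat\theta_{j,n}^2$. The helper names and the data vector are hypothetical.

```python
import math


def phi(j, x):
    """Trigonometric basis (2.4), indexed from j = 1."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def fourier_coefficients(ys):
    """Empirical Fourier coefficients (2.6) on the design x_l = l / n."""
    n = len(ys)
    xs = [l / n for l in range(1, n + 1)]
    return [sum(y * phi(j, x) for y, x in zip(ys, xs)) / n for j in range(1, n + 1)]


def integrated_variance_estimate(ys, d_n):
    """Sum of squared coefficients with index j > d_n (assumed form of the estimator)."""
    if not 1 <= d_n <= len(ys) - 1:
        raise ValueError("d_n must satisfy 1 <= d_n <= n - 1")
    theta_hat = fourier_coefficients(ys)
    return sum(t * t for t in theta_hat[d_n:])


ys = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, 0.0]   # n = 7 (odd), arbitrary illustrative data
estimate = integrated_variance_estimate(ys, d_n=1)
```

With $d_n = 1$ only the constant term is removed, so for odd $n$ the estimate reduces to the empirical variance of the observations by the orthonormality property (2.5). Larger values of $d_n$ also remove the low-frequency part of the signal.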
In Section 4 we show the following result.

Lemma 3.1. For any $n \ge 2$, $r > 0$ and $S \in W_r^1$,

e ic cl< 2jveW2>^w 
If we choose the parameter $d_n$ in (3.5) such that

$$ \lim_{n\to\infty} d_n/\sqrt{n} = 0 \quad\text{and}\quad \lim_{n\to\infty} d_n^2/\sqrt{n} = \infty , \tag{3.7} $$

we obtain that

lim ?*(r) = . 

n— Yoo 

Theorem 2.1 and inequality (3.6) imply immediately the following result.

Theorem 3.2. Let $\Lambda$ be any finite set in $[0,1]^n$. Assume that in the model
(1.1) the function $S$ belongs to $W_r^1$. Then, for any $n \ge 3$ and $0 < \rho < 1/3$,
the procedure $\hat S_*$ from (2.15) with $\hat\varsigma_n$ defined by (3.5) and (3.7) satisfies the
following oracle inequality

$$ \mathbf{E}_S \|\hat S_* - S\|_n^2 \le \frac{1+3\rho-2\rho^2}{1-3\rho} \min_{\lambda\in\Lambda} \mathbf{E}_S \|\hat S_\lambda - S\|_n^2 + \frac{1}{n} \mathcal{D}_n(\rho, r) , \tag{3.8} $$

where

r) = * B (p) + (2 (Ve + v") <r, + <(r)) ^ . 

Moreover, if the sequences (2.17) satisfy the properties (2.27)-(2.29) then,
for any $\gamma > 0$,

lim sup T> n (p, r)/n 7 = , 

where $g_n$ is any slowly decreasing sequence, i.e. satisfying (2.22).

Remark 3.1. The inequality (3.8) is used to prove the asymptotic efficiency
of the estimator (2.15) (see [11]). To obtain the minimal asymptotic
quadratic risk, one has to take a weighted least squares estimator (2.8) with
weights of type (2.25), where the parameter $\alpha$ depends on the unknown
smoothness of the unknown function $S$. So one cannot use this estimator directly. The
oracle inequality allows us to overcome this difficulty, because the upper bound
is majorized, up to a multiplicative and an additive constant, by the minimal
quadratic risk over all weighted estimators including the optimal one. Taking
into account that the multiplicative constant tends to unity as $n \to \infty$ and
that the additive constant is negligible in comparison with any power of $n$ and,
in particular, with the optimal convergence rate, and then multiplying the
inequality (3.8) by the optimal convergence rate, one obtains the asymptotically
minimal upper bound for the quadratic risk of the estimator (2.15). The last
result means that this estimator is asymptotically efficient.
4 Proofs

4.1 Proof of Theorem 2.1

First of all, note that we can represent the empirical squared error $\mathrm{Err}_n(\lambda)$
in the following way

$$ \mathrm{Err}_n(\lambda) = J_n(\lambda) + 2 \sum_{j=1}^{n} \lambda(j) \theta'_{j,n} + \|S\|_n^2 - \rho \hat P_n(\lambda) \tag{4.1} $$

with $\theta'_{j,n} = \tilde\theta_{j,n} - \theta_{j,n} \hat\theta_{j,n}$. The second term on the
right-hand side of (4.1) is the most difficult to handle. To estimate it, we decompose
it into a sum of terms and apply an appropriate technique to each of them. By setting


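$$ \varsigma_{j,n} = \mathbf{E}_S \xi_{j,n}^2 = \frac{1}{n} \sum_{l=1}^{n} \sigma_l^2 \phi_j^2(x_l) \quad\text{and}\quad \tilde\mu_{j,n} = \xi_{j,n}^2 - \varsigma_{j,n} \tag{4.2} $$

we find that

$$ \theta'_{j,n} = \frac{1}{\sqrt{n}} \theta_{j,n} \xi_{j,n} + \frac{1}{n} \tilde\mu_{j,n} + \frac{1}{n} (\varsigma_{j,n} - \hat\varsigma_n) . \tag{4.3} $$

Moreover, we can represent $\tilde\mu_{j,n}$ as

$$ \tilde\mu_{j,n} = \frac{1}{n} \sum_{l=1}^{n} \sigma_l^2 \phi_j^2(x_l) \eta_l + 2 \sum_{l=2}^{n} \tau_{j,l} \xi_l = \mu'_{j,n} + 2 \mu''_{j,n} \tag{4.4} $$

with $\eta_l = \xi_l^2 - 1$ and

$$ \tau_{j,l} = \frac{1}{n} \sigma_l \phi_j(x_l) \sum_{k=1}^{l-1} \sigma_k \phi_j(x_k) \xi_k . $$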

Now we set 

n -.n 

iv'(A) = e aok» and ^^E^^ 1 ^}' ( 4 - 5 ) 

where $\bar\lambda(j) = \lambda(j)/|\lambda|$. In the Appendix we show that

sup E s |iV'(A)| < a*(v n + u J^I (4.6) 
AeA V n 
and

sup E 5 (iV"(A)) 2 < ^ . (4.7) 



Now, for any $\lambda \in \Lambda$, we rewrite (4.1) as



Err„(A) = J n (A) + -N'(X) + 4^P n (X)N"(X) 
+ 2M(X) + -A(X) + \\S\\ 2 -pP n (X) 



n 



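where $P_n(\lambda)$ is defined in (2.13),

$$ \Delta(\lambda) = \sum_{j=1}^{n} \lambda(j) (\varsigma_{j,n} - \hat\varsigma_n) \quad\text{and}\quad M(\lambda) = n^{-1/2} \sum_{j=1}^{n} \lambda(j) \theta_{j,n} \xi_{j,n} . \tag{4.8} $$

Further we estimate the term $\Delta(\lambda)$. Setting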



n 

we obtain that 



i=i 



|a(a)| < i ^Aox-j + ^-g 

i=i 

<o*u^ n + v n \q n -Sn\- ( 4 - 10 ) 
Now from (4.1) we obtain that, for some fixed $\lambda_0 \in \Lambda$,

ErrJA) - Err„(A ) = J (A) - J(A ) + 2M0) + 

+ Ayfp~$)N"(X) - 4^Pj\)N"(X ) 
- pP n (A) + pP„(A Q ) + ^ (A(A) - A(A : 

where $\vartheta = \hat\lambda - \lambda_0$.

By the definition of $\hat\lambda$ in (2.14) and by (4.10) we get

4a* Ul +4v n \q n - <; n \ 



Err n (A) - Err n (A ) < 2M(tf) + 



n 



+ -N'(#)+^P n (X)N"(X)-pP n (X) 



+ P P n (X )-A^PjXo)N"(Xo). 



Moreover, making use of the inequality

$$ 2|ab| \le \varepsilon a^2 + \varepsilon^{-1} b^2 \tag{4.11} $$

with $\varepsilon = \rho/4$ and taking into account the definition of the penalty term in
(2.12), we deduce, for any $\lambda \in \Lambda$,

aVpjX)\n"(X)\ < pp„m + 4 (iV "| ) A))2 

n p 

Thus from here it follows that 

Err n (A) < Err n (A ) + 2M0) + T n + 2pP n (\ ) , (4.12) 

where 

y> 4 8 2 4(7^ 4 + 2p 
n 1 p 2 n n 

with $N'_* = \sup_{\lambda\in\Lambda} |N'(\lambda)|$ and $N''_* = \sup_{\lambda\in\Lambda} |N''(\lambda)|$. Moreover, note that
the bounds (4.6), (4.7) and (4.10) imply that

E 5 T n < + i±^« B E 5 ft - ,J , (4.13) 

where the function T*(p) is defined in (12.161) . 

Now we study the second term in (4.8). First, note that for any nonrandom
vector $\vartheta = (\vartheta(1), \ldots, \vartheta(n))' \in \mathbb{R}^n$ Lemma A.4 implies

n- " II C 112 

E 5 M 2 W < £ £tf 2 (j)^ n = ^™ , (4-14) 



where 



We set now 



nM 2 (tf) , , 
Z* = sup ^ with A x = A - A 






To estimate this term we apply the inequality (4.14), i.e.



V S Z* < £ |fc 112 ^ w* • (4.15) 



Moreover, making use of inequality (4.11) with $\varepsilon = \rho \|S_\vartheta\|_n$, we get



,2 Z* 



2\M{$)\<p\\SX J r — ■ (4-16) 
n np 

Now we estimate $\|S_\vartheta\|_n^2$. We have

n 

ll^lln - H^Hn = E ^Wln ~ ^n) < ^(tf) (4.17) 

with 

1 n 

3=1 

Now, taking into account that $|\vartheta(j)| \le 1$ for any $\vartheta \in \Lambda_1$, we obtain

E s M^)<aJ^^. 

Putting 

Z l = SU P II Q 112 ' 

we get 

E S Z* < z/a* . (4.18) 

Therefore, applying inequality (4.16) for $M_1(\vartheta)$ in (4.17) we deduce the upper
bound for $\|S_\vartheta\|_n^2$, i.e.

\\SJ 2 < t^—\\SJ 2 + —rr — (4-19) 
ii m n - i _p H np{\- p) y ! 

Taking into account this inequality in (4.16) we obtain that



1 -p" # " n np(l-p) 
< 2p(Err w (A) + Err n (A )) Z* + Z* 



1 - p np{\ - p) 
Therefore (4.12) implies that



ErrJA)<i±|-ErrJA ) + i^-T„ 

np(l-3p) + l-3p " lA ° j ' 



Now by inequalities (4.15)-(4.18) we get that

E s Err n (A) < i^E 5 Err n (A ) + -^E fl T B 

+ np(l - 3p) + "T^3p" EsPJAo) • 

By making use of inequality (4.13) and Lemma A.1 we come to Theorem 2.1.
□



4.2 Proof of Lemma 3.1

First notice that from (2.7) we obtain that



3=d n +l V j=d n +l 



n n , 

d n 

c. — — c 



3=d n +l 3=d n +l 

2 a 1 a 1 a d n 

:= A, + — A 2 + -A 3 + -A 4 - -a ^ , 

Jn n n n 



where // • and q • are defined in H4.3[) and (14. 9p respectively. 

We estimate the first term by Lemma A.3 for $S \in W_r^1$. We have

4r 

At < — . 

n 

The next term we estimate with the help of Lemma A.4. We get that

,2 - ■ - 4r 



E S (A 2 )' < (7*A 1 < a,— . 
By (4.4) and (4.5) we can represent $\Delta_3$ as

A 3 = iV'(A / ) + 2|A / | v ^:A r,/ (A / ) 

with the vector $\lambda' = (\lambda'(1), \ldots, \lambda'(n))'$ having the indicator components,
i.e. $\lambda'(j) = \mathbf{1}_{\{j > d_n\}}$. By estimating $\phi_j^2$ by 2 in (A.1) we obtain



E s \N'(Xj)\ < 2a, Ve^- 
Thus the upper bound (4.7) implies

E 5 |A 3 | < 2<7»( v / ?* + \ / 2)V^ = vVn- 
Moreover, due to Lemma A.2 with $k = 0$, one has



i=l j=d n +l 



i=i 



E(^)-i) 



<7* 



X>^)-i; 



< 2a*. 



Hence Lemma 3.1. □



5 Numerical example

In this section we compare, via simulations, the adaptive procedure proposed
for the heteroscedastic regression in jl] (section 4.1) with that of (2.15).
Consider the model (1.1) with

$$ S(x) = x \sin(2\pi x) + x^2 (1-x) \cos(4\pi x) \quad\text{and}\quad \sigma_j^2 = 1 + S^2(x_j) , $$

assuming that $(\xi_k)_{k\ge 1}$ follow the gaussian distribution with zero mean and
unit variance.

In the procedure (2.15) we take the weight vectors defined in (2.25) with
$k^* = 100 + \sqrt{\ln n}$, $\varepsilon = 1/\ln n$, $m = \ln^2 n$, $\rho = 1/(3 + \ln^2 n)$ and

W/3 = io + (v^) 1/(2/3+1) - 

Moreover, we make use of the estimate (3.5) with $d_n = n^{1/3}$.
The results of simulations are reported in Table 1.
Table 1

$\mathbf{E}\|\hat S_* - S\|_n^2$    $\mathbf{E}\|\hat S - S\|_n^2$    $n$
0.260                               0.410                             21
0.148                               0.427                             41
0.058                               0.476                             101
0.034                               0.430                             201
0.019                               0.448                             401

The columns of Table 1 with the headings $\mathbf{E}\|\hat S_* - S\|_n^2$, $\mathbf{E}\|\hat S - S\|_n^2$ and $n$
report, respectively, the empirical quadratic risk for the procedure (2.25), the
empirical quadratic risk for the procedure from [4] and the sample size. To
calculate the empirical risks, 50 independent Monte Carlo simulations were
performed.
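The Monte Carlo computation of an empirical quadratic risk can be sketched as follows for the model of this section. This is not the adaptive procedure itself: the weight vector is fixed in advance, and the function names, the seed and the choice of weights are illustrative assumptions.

```python
import math
import random


def phi(j, x):
    """Trigonometric basis (2.4), indexed from j = 1."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def S(x):
    """Regression function of the numerical example."""
    return x * math.sin(2.0 * math.pi * x) + x ** 2 * (1.0 - x) * math.cos(4.0 * math.pi * x)


def empirical_risk(n, lam, n_rep=50, seed=0):
    """Average of ||S_hat_lambda - S||_n^2 over n_rep simulated samples (fixed weights lam)."""
    rng = random.Random(seed)
    xs = [l / n for l in range(1, n + 1)]
    s_vals = [S(x) for x in xs]
    basis = [[phi(j, x) for x in xs] for j in range(1, n + 1)]
    total = 0.0
    for _ in range(n_rep):
        # Heteroscedastic noise with sigma_j^2 = 1 + S^2(x_j) and gaussian xi_j.
        ys = [s + math.sqrt(1.0 + s * s) * rng.gauss(0.0, 1.0) for s in s_vals]
        theta_hat = [sum(y * b for y, b in zip(ys, row)) / n for row in basis]
        fitted = [
            sum(lam[j] * theta_hat[j] * basis[j][l] for j in range(n)) for l in range(n)
        ]
        total += sum((f - s) ** 2 for f, s in zip(fitted, s_vals)) / n
    return total / n_rep


n = 21
risk_zero = empirical_risk(n, [0.0] * n)                  # estimator identically zero
risk_proj = empirical_risk(n, [1.0] * 5 + [0.0] * (n - 5))  # projection on 5 functions
risk_full = empirical_risk(n, [1.0] * n)                  # no smoothing at all
```

Replacing the fixed weights by the data-driven choice (2.14) inside the loop would give the empirical risk of the adaptive procedure under the same simulation design.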

Table 1 shows that, in comparison with the procedure from [4], our adaptive
estimator performs reasonably well for small sample sizes.



6 Appendix

A.1 Proof of (4.6)

First note that we can represent the term $N'(\lambda)$ as



2 n 



iV '( A ) = E^^ with ^ = -^E A ^>? 

ii j 

i=i j=i 

Recalling that $\eta_l = \xi_l^2 - 1$ we calculate



x, 



BsTO) 



^2 C~l 



rr 



i=i 



Therefore for any vector A G 



E s \N'(X)\<a^ max | 



(A.l) 
Thus, taking into account the definitions (2.17), we come to inequality (4.6).
□



A.2 Proof of (4.7)

By putting $g_l = \sum_{j=1}^{n} \bar\lambda(j) \tau_{j,l}$ and taking into account that the random
variables $(\xi_k)_{1\le k\le n}$ are independent of $(\sigma_k)_{1\le k\le n}$, we obtain that

tn \ ~ 1 n 

X>* a (A - 2) 
fc=l / Z=2 

where 

cr 2 ' 1 / n 



fc=i \j=i 



Therefore the orthonormality property (12.51) implies that for any A G 



2 n / n 

k=i \j=i 



-4 t j2 m^ < 2 -^o 



2 



Now by making use of this inequality in (A.2) we get (4.7).
□



A.3 Technical lemma

Lemma A.1. For any $n \ge 1$ and $\lambda \in \Lambda$,



E 5 P n (A) < E 5 Err n (X) + -^E s |?„ - , n \ + 
Proof. Indeed, by the definition of $\mathrm{Err}_n(\lambda)$ we have

Err„(A) = £ f (A(j) - 1)^„ + Kj)^, n ) ■ 
i=i 
Therefore,



1 n 1 n 

E s Err n (A) > E s - £ A 2 (,) ^ = E g - E A 2 (j) 



where the sequence $(\varsigma_{j,n})$ is defined in (4.2). Moreover, note that the last
term can be estimated as



n n 



s,) - i; 



We recall that the definition of the set $\Lambda$ and the definition of $v_n$ in (2.17)
imply that $|\lambda|^2 \le v_n$ for $\lambda \in \Lambda$. Therefore for any $\lambda \in \Lambda$



* U 2,n 



|A| Ic, - c 



E^(j)S,n>l^-^ 



Hence the desired inequality. □



A.4 Properties of trigonometric basis

Lemma A.2. For any $k \ge 0$,

$$ \sup_{N\ge 2} \sup_{x\in[0,1]} N^{-k} \Big| \sum_{l=2}^{N} l^k \big(\phi_l^2(x) - 1\big) \Big| \le 2^k . \tag{A.3} $$



Proof. Due to the properties of the trigonometric functions, we get 



N 



1=2 



E ww - 1) = E ( 2/ ) fc cos ( 47r/2; ) 

l<J<AT/2 

E ( 2/ + 1 )' £ cos(4tt/x) . 

l<i<(JV-l)/2 
This yields



N 



E {k (tit?) - !) 



Z=2 



< 



E ((2/ + l) fc - (2/) fc ) cos(4tt/x) 

l<Z<(JV-l)/2 
< E (( 2/ + " ( 2/ ) fc ) + Nk 



N 



l<Z<(jV-l)/2 



E E ( 2/ ) J + ^ 



l<i<(JV-l)/2 j=0 



This implies (A.3).
□

Lemma A.3. For any function $S \in W_r^k$,

$$ \sup_{n\ge 1} \sup_{1\le m\le n-1} m^{2k} \Big( \sum_{j=m+1}^{n} \theta_{j,n}^2 \Big) \le 4r . \tag{A.4} $$



Proof. First, note that any function $S$ from $W_r^k$ can be represented by its
Fourier series, i.e. $S = \sum_{j=1}^{\infty} \theta_j \phi_j$ with the coefficients defined by (3.4). By
denoting the residual term for $S$ as



A m (x) = s - E^- = E 

j=l j=m+l 



X) 



we obtain that 



E 9] n = inf ||S-E«^H« < IIAJfc- 

^ — ' ■>>" ai,...,a m * — ' 
j=m+l j=l 



Moreover, it is easy to deduce that 

k=l k=l ^ x k-i 

ri _ n _ r x k 

< 2 / A ? 2 n (x)da; + 2^ (A m (x k ) - A m (x)) 2 dx . 



k=i " x k-i 
The last term in this inequality we estimate as

(A m (x k ) - A m (x)) 2 = (J^ " A m (z)dz 

< n- 1 [ " (A m (z)) 2 dz. 



Therefore, 



2 

I A II 2 < 211 A II 2 -I II A II 2 

oo 2 00 



2 e e nm 



rr 

j=m+l j=m+l 



Now note that by the representation of the set $W_r^k$ in the form (3.3) we can
estimate the first term in the last inequality as



oo 

,2 a j . r r 



3 3 aj a m+1 (nm)™ 

Similarly, we find that 

f e\\6 f < sup l ^lr < sup ML r < r 
Therefore, for m < n we get that 

oo 

n2 II 1 II 2 



j=m+l 

This implies (A.4). □

Lemma A.4. Let $\xi_{j,n}$ be defined in (2.7) for the model (1.1). Then, for any
real numbers $f_1, \ldots, f_n$,

$$ \mathbf{E}_S \Big( \sum_{j=1}^{n} f_j \xi_{j,n} \Big)^2 \le \sigma_* \sum_{j=1}^{n} f_j^2 . \tag{A.5} $$
Proof. Due to the definition of $\xi_{j,n}$, one has

n n \ n 

E = X>£& with ^=-^E^^)- 

i=i 2=1 v j=i 

Moreover 

E/,«J = E * »• E # 

n 

= cr* E ^ ^ ' 

The orthogonality of the basis $(\phi_j)$ implies inequality (A.5). Hence Lemma A.4.
□